Improved Reconstruction of Protolanguage Word Forms

Authors

  • Alexandre Bouchard-Côté
  • Thomas L. Griffiths
  • Dan Klein
Abstract

We present an unsupervised approach to reconstructing ancient word forms. The present work addresses three limitations of previous work. First, previous work focused on faithfulness features, which model changes between successive languages. We add markedness features, which model well-formedness within each language. Second, we introduce universal features, which support generalizations across languages. Finally, we increase the number of languages to which these methods can be applied by an order of magnitude by using improved inference methods. Experiments on the reconstruction of Proto-Oceanic, Proto-Malayo-Javanic, and Classical Latin show substantial reductions in error rate, giving the best results to date.

Similar resources

Protolanguage in ontogeny and phylogeny

We approach the issue of holophrasis versus compositionality in the emergence of protolanguage by analyzing the earliest combinatorial constructions in child, bonobo, and chimpanzee: messages consisting of one symbol combined with one gesture. Based on evidence from apes learning an interspecies visual communication system and children acquiring a first language, we conclude that the potential ...

Glottochronology and problems of protolanguage reconstruction

V. V. Kromer. Glottochronology and problems of protolanguage reconstruction.

Holistic or Synthetic Protolanguage: Evidence from Iterated Learning of Whistled Signals

Many arguments have been proposed in favor of and against the idea of protolanguage as a set of holistic utterances that were later segmented into words. This paper presents data from a human iterated learning experiment which corroborates arguments in favor of holistic protolanguage and forms a counterexample to some of the arguments that were proposed against it. This experiment involves iter...

The Relative Divergence of Dutch Dialect Pronunciations from their Common Source: An Exploratory Study

In this paper we use the Reeks Nederlandse Dialectatlassen as a source for the reconstruction of a ‘proto-language’ of Dutch dialects. We used 360 dialects from locations in the Netherlands, the northern part of Belgium and French-Flanders. The density of dialect locations is about the same everywhere. For each dialect we reconstructed 85 words. For the reconstruction of vowels we used knowledg...

An Application of Computer Programming to the Reconstruction of a Proto-Language

1. Purpose. This paper illustrates the use of a computer program as a tool in linguistic research. The program under consideration produces a concordance on words according to phonological segments and environments. Phonological segments are defined as a predetermined set of consonants and vowels. An environment is defined as the locus of occurrence of any of the phonological segments. The conco...

Publication date: 2009